Introduction



School districts in Ohio were given the opportunity to apply for funding to
support district-based reading programs through a request-for-proposal procedure. An
effort was made to score the submitted applications reliably and equitably. The grant
selection process for the reading excellence program included two phases. In the first
phase, applications were selected that met established criteria and were thought to have
sufficient merit to be considered for phase two. The criteria used for selection in
the first phase were: a) basic requirements, b) required information, c) significance, d)
quality of the proposal, e) adequacy of resources, and f) quality of project evaluation.

Phase two, on the other hand, was an on-site evaluation conducted by a three-member
review team to assess the strengths of the proposed Reading Excellence Act (REA) grant
application. The purposes of phase two were to: a) verify application information;
b) assess the capacity and commitment of the districts; and c) ensure the likely
success of the proposal (ODE, 1999). The criteria used to evaluate the
strengths of the proposed REA grant application in phase two were:

a) Coordination of REA activities that impact classroom instruction and focus on 
improving student academic performance; 

b) Connections among professional development for teachers, family literacy 
activities, and extended learning opportunities for students; 

c) Evidence of scientifically based reading research in literacy plan;

d) Commitment of staff members at targeted elementary schools; 

e) Active involvement of district leadership in REA plan;

f) Partnerships between targeted schools and early childhood educators in
community;

g) Involvement of parents and other community stakeholders; 

h) Alignment of district resources and REA proposed budget to meet literacy
goals identified in grant proposal;

i) Capacity to implement proposed literacy plan. 

Ordinary analysis procedures were not considered appropriate for analyzing the
quality of applications in a fair grant funding procedure. Because multiple raters were
going to read and evaluate each grant application, the many-facet Rasch model, an item
response theory approach, was considered a more effective procedure for identifying the
quality of applications. According to Linacre (1993), the many-facet Rasch model has
distinct advantages over classical data analysis. These advantages include the use of
person measures rather than raw scores and the adjustment of person measures for the
facets included in the model (Weigle, 1994). Another advantage of the many-facet Rasch
model is the facet “connectedness” that is required if linear “rules” are to be created
for each facet (Schumacker, 1996). The facet analysis “can provide a framework for
obtaining objective and fair measurement” (Engelhard, 1992).

Method of Inquiry and Instrumentation 

The data set includes 106 applications from over 3000 buildings in 612 school
districts. Applications were assigned randomly to 114 readers, who evaluated the quality
of the applications; at least 4 raters were assigned to each application.

The survey used to evaluate grant applications for the Reading Excellence
Act contains three subscales and 26 questions. The first six items relate to the district’s
reading tutor program as discussed in the application. The next eight questions address
school-based tutor services. The following eleven questions relate only to the contracted
tutorial services. An overall evaluation of grant application quality is given by the last
question (#26). A six-point, Likert-type scale was used to evaluate the quality of each
application using the following criteria:

1 = No evidence; very weak; lowest rating

2 = Minimal evidence; little support; hardly observable; vague; weak concepts

3 = Some evidence; some potential for effectiveness; partially developed concepts;
needs more work

4 = Enough evidence to indicate a fairly good chance of success; good concepts;
on the right track

5 = Strong evidence, easily seen; several successes seen; well developed concepts;
well underway

6 = Exceptionally strong evidence; outstanding potential; high quality;
exceptional quality; highest rating.

Results 

An extension of the Rasch model to include multiple facets (the FACETS model) was
used in analyzing the Reading Excellence Act Tutorial Assistance survey. FACETS
analysis provides estimates of examinee ability, rater severity, and item difficulty
on a common log-linear metric, the logit scale (Linacre, 1993). The three-facet model
with facets of application, rater, and item can be expressed as follows:

$$\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k$$

where

$P_{nijk}$ = probability that person n on item i is rated by judge j with a score of k

$P_{nij(k-1)}$ = probability that person n on item i is rated by judge j with a score of k - 1

$B_n$ = ability of person n (here, the quality of application n)

$D_i$ = difficulty of item i

$C_j$ = severity of judge j

$F_k$ = challenge of score category k relative to category k - 1.
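
To make the model concrete, the following minimal Python sketch computes the probability of each score category for one combination of application, item, and rater. It assumes a six-category rating scale shared by all items, with the lowest category serving as the reference; the function name `category_probabilities` and all numeric values are hypothetical illustrations, not estimates from this study.

```python
import math

def category_probabilities(b_n, d_i, c_j, thresholds):
    """Return P(score = k) for k = 1..K under the three-facet rating scale model.

    thresholds holds F_2..F_K, the challenge of each step from category k-1 to k.
    """
    eta = b_n - d_i - c_j
    # Log-numerators: the lowest category is the reference (0.0); each higher
    # category adds (eta - F_k), so adjacent log-odds equal B_n - D_i - C_j - F_k.
    log_num = [0.0]
    for f_k in thresholds:
        log_num.append(log_num[-1] + eta - f_k)
    top = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: application quality, item difficulty, rater severity (logits).
probs = category_probabilities(b_n=-0.5, d_i=0.2, c_j=0.3,
                               thresholds=[-2.0, -1.0, 0.0, 1.0, 2.0])
for score, p in enumerate(probs, start=1):
    print(score, round(p, 3))
```

A more severe rater (larger `c_j`) or a harder item (larger `d_i`) lowers the linear term and shifts probability toward the lower score categories, which is how the model adjusts application measures for the raters and items involved.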

In addition to providing logit estimates of the ability, severity, or difficulty of each
element of each facet, FACETS also provides statistics indicating the relative spread of
these estimates within each facet. In other words, the analysis provides information about
the significance of any differences that may exist among elements of a facet, for example,
differences in severity among raters or in quality among applications.

Another important feature of the FACETS analysis is that it provides fit statistics
for each element, which indicate the degree to which each element is
behaving in the manner predicted by the model. In the case of raters, the fit statistics
are indicators of rater consistency. Thus, a detailed picture of the behavior of each rater
in terms of both severity and consistency can be formed.

As an overall introduction to the results of the Reading Excellence Act Tutorial
Application Rating analysis, Figure 1 shows graphically the measures for applications,
raters, and items. The figure is to be interpreted as follows. The scale along the left
of the figure is the logit scale, which is the same for all three facets. Each
application is represented by a star (*). Applications are ordered so that the highest
quality application is at the top and the lowest quality application is at the bottom.
The other facets are ordered so that the most difficult element of each facet is toward the
top and the least difficult element toward the bottom. For raters, for example,
the most severe rater is uppermost in Figure 1. Similarly, the item that is most difficult
to endorse is uppermost in Figure 1. Figure 1 thus shows pictorially the differences
across the elements of each facet.

Figure 1. Reading Excellence Act Tutorial Assistance All Facet Summary.

Vertical = (1*,2*,3*) Yardstick (columns, lines, low, high) = 0,10, -3,1

[Figure 1: vertical ruler on the logit scale (Measr, +1 at the top to -3 at the bottom)
with columns for +app, -raters, -items, and the rating scale (6 at the top, 1 at the
bottom). Each * stands for 1 application, 2 raters, or 1 item.]

In Figure 1, applications are ordered with the highest quality applications at the
top and the lowest quality applications at the bottom. As Figure 1 indicates,
application estimates range from a high of about +1 logits to a low of close to -3 logits.
Looking at the column for applications in Figure 1, we can see that more applications
are of low quality than of high quality: 22 out of 104 applications (21%) lie above
the mean (0).

Looking at the column for raters in Figure 1, we can see that raters are mainly
neither severe nor lenient. Of the raters clustered around the mean (0), 54 out of 114
readers (47%) tend to be more severe than lenient.

Looking at the column for items in Figure 1, we can see that items are mainly
neither high nor low in quality. Of the items clustered around the mean (0), 12 out of 26
items (46%) are of higher rather than lower quality.

Application Analysis 

Application’s Quality 



A more detailed analysis of the applications is found in Table 7.1.1, the
Application Measurement Report for the Reading Excellence Act Tutorial
Assistance. Applications are presented in descending order of quality; in other words,
applications #3029 and #3026 are the highest quality applications and applications #3060
and #3061 are the lowest quality applications.

| Obsvd Obsvd   Obsvd    Fair |        Model |     Infit      Outfit |
| Score Count Average  Avrage |Measure  S.E. | MnSq ZStd   MnSq ZStd | Num  app
|-----------------------------|--------------|-----------------------|----------
|                 1.4    1.37 |  -2.52   .14 |  0.8   -1    0.7   -1 |3060  3060
|-----------------------------|--------------|-----------------------|----------
| 497.6 157.7     3.1    3.09 |   -.51   .09 |  1.0 -0.4    1.0 -0.4 | Mean (Count: 106)
| 988.5 269.0     0.7    0.73 |    .68   .01 |  0.3  2.9    0.3  2.9 | S.D.


RMSE (Model) .09 Adj S.D. .67 Separation 7.70 Reliability .98 
Fixed (all same) chi-square: 5760.1 d.f.: 105 significance: .00 

Random (normal) chi-square: 104.4 d.f.: 104 significance: .47 



Note: * Muting 

** Noise 



The FACETS analysis provides a number of indications of the magnitude of the
differences among elements of a facet: in this case, in quality among applications. These
are the RMSE, Reliability, Separation Index, Fixed (all same) and Random (normal)
Chi-square, and Infit and Outfit statistics.

The root mean square standard error (RMSE) is computed over all non-extreme
application measures. The RMSE of .09 shows that the measurement error for applications
is very low. After the application variance was adjusted for measurement error, the
adjusted standard deviation was below 1.0 (.67).

The Reliability statistic provided by the FACETS analysis indicates the degree to
which the analysis reliably distinguishes between different levels of quality among the
elements of the facet (in this case, the different applications). For applications, the
reliability is .98, indicating that the analysis is reliably separating applications into
different levels of quality.

The Separation index is the ratio of the corrected standard deviation of element
measures (in this case, applications) to the root mean-square estimation error. If the
applications were of equal quality, the standard deviation of the application quality
estimates would be equal to or smaller than the mean estimation error of the entire data
set. However, the Application Separation Index is 7.70, indicating that the spread
(standard deviation) of the application measures is about eight times the error of the
estimates.
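
As a rough check on these indices, the sketch below computes the separation index from the adjusted standard deviation and the RMSE, and the reliability implied by a given separation through the relation reliability = G^2 / (1 + G^2). The function names are illustrative. Using the rounded values printed in Table 7.1.1 (.67 and .09) gives a separation of about 7.4 rather than the reported 7.70, presumably because the report works from unrounded values.

```python
def separation_index(adj_sd, rmse):
    """Ratio of the error-adjusted spread of the measures to their average error."""
    return adj_sd / rmse

def reliability_from_separation(g):
    """Share of the observed variance that is not measurement error, given separation g."""
    return g ** 2 / (1.0 + g ** 2)

# Application facet, values as printed in the Table 7.1.1 summary lines.
g_rounded = separation_index(adj_sd=0.67, rmse=0.09)   # from rounded inputs
rel_reported = reliability_from_separation(7.70)       # from the reported separation

print(round(g_rounded, 2), round(rel_reported, 2))
```

The same relation reproduces the reliabilities reported for the other facets from their separation indices, for example 3.89 for raters.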

Finally, the Fixed (all same) Chi-square tests the null hypothesis that all of the
elements of the facet are equal. The Chi-square of 5760.1 with 105 d.f. is significant at
p < .01 (reported significance .00), indicating that the null hypothesis must be rejected;
in other words, the applications are not equal in quality.
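
The fixed (all same) chi-square can be sketched as a precision-weighted homogeneity test on the element measures. The formula below, with weights 1/SE^2 and L - 1 degrees of freedom for L elements, is the form commonly documented for FACETS output and is stated here as an assumption; the measures and standard errors in the example are made-up illustrative values, not rows from Table 7.1.1.

```python
def fixed_chi_square(measures, std_errors):
    """Return (chi-square, degrees of freedom) for H0: all element measures are equal."""
    weights = [1.0 / se ** 2 for se in std_errors]  # precise measures count more
    sum_w = sum(weights)
    sum_wd = sum(w * d for w, d in zip(weights, measures))
    sum_wd2 = sum(w * d ** 2 for w, d in zip(weights, measures))
    # Weighted sum of squared deviations from the precision-weighted mean.
    chi_sq = sum_wd2 - sum_wd ** 2 / sum_w
    return chi_sq, len(measures) - 1

# Hypothetical logit measures and standard errors for four elements.
chi_sq, df = fixed_chi_square([0.9, 0.1, -0.5, -2.5], [0.09, 0.08, 0.10, 0.14])
print(round(chi_sq, 1), df)
```

When the measures differ by many standard errors, as they do for the applications, the statistic is far larger than its degrees of freedom and the hypothesis of equal elements is rejected.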

The FACETS analysis also provides two measures of fit, or consistency: the infit
and the outfit. The infit is the weighted mean-squared residual, which is sensitive to
unexpected responses near the point where decisions are being made. A value less than .8
indicates muting: too little variation, lack of independence. A value more than 1.3
indicates noise: unmodelled excess variation. The outfit, on the other hand, is the
unweighted mean-squared residual and is sensitive to extreme scores. This fit statistic
has the same form as the infit, but it is the conventional mean-square, which is more
sensitive to outliers.
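
The two mean-squares can be sketched as follows for a single element (for example, one rater). The inputs are the observed ratings, the ratings expected under the model, and the model variance of each rating; the function name and the numbers in the example are illustrative assumptions, not values from this study.

```python
def fit_mean_squares(observed, expected, variances):
    """Return (infit, outfit) mean-squares for one element's ratings."""
    sq_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: plain average of the squared standardized residuals, so a single
    # very unexpected rating (an outlier) can dominate it.
    outfit = sum(r / v for r, v in zip(sq_residuals, variances)) / len(sq_residuals)
    # Infit: squared residuals summed and divided by the summed model variances,
    # which weights each rating by its statistical information.
    infit = sum(sq_residuals) / sum(variances)
    return infit, outfit

# Hypothetical ratings: the last one (6 against an expected 3.0) is highly unexpected.
infit, outfit = fit_mean_squares(
    observed=[3, 4, 2, 6],
    expected=[3.2, 3.5, 2.4, 3.0],
    variances=[1.0, 1.1, 0.9, 1.2],
)
print(round(infit, 2), round(outfit, 2))
```

In this example both mean-squares come out well above the noise cut-off, driven by the one unexpected rating. When the squared residuals match the model variances, both statistics equal 1, the value expected under the model.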

In addition to the mean squares, FACETS provides standardized infit and outfit
statistics, which have an expected mean of 0 and standard deviation of 1. These statistics
are useful for comparing the elements of a facet with each other, as they show the degree
of variability in the ratings of each element relative to the amount of variability in the
entire set. Standardized fit statistics greater than 6 or less than -6 are generally signs
of misfit.

Applying this information to Table 7.1.1, we can see that 41 out of 106
applications had very high (>1.4) or low (<0.6) infit and/or outfit statistics. These
applications are: #3069, #3031, #3043, #3018, #3000, #3081, #3056, #3080, #3086, #3057,
#3085, #3099, #3041, #3046, #3044, #3005, #3072, #3074, #3052, #3011, #3089, #3022,
#3106, #3007, #3076, #3096, #3071, #3082, #3039, #3004, #3002, #3017, #3104, #3088,
#3024, #3101, #3010, #3047, #3001, #3065, and #3030 (see Table 7.1.1 for more detail).
These infit/outfit statistics indicate that the ratings of these applications were not
consistent with the estimated quality measures, with 114 of the scores showing “noise”
and 5 outfit scores showing “muting”. In most cases the noise or muting is not severe.
In a few instances the noise is substantial (#3052, #3004 and #3017), and
replacing an aberrant rater may be a valuable solution.
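
The screening rule applied above can be written as a small helper. This is a sketch using the cut-offs applied in this section (above 1.4 as noise, below 0.6 as muting); the function name is hypothetical. The first call uses the infit and outfit mean-squares reported for application #3060, and the other two inputs are hypothetical.

```python
def classify_fit(infit, outfit, low=0.6, high=1.4):
    """Label an element as 'noise', 'muting', or 'acceptable' from its mean-squares."""
    if infit > high or outfit > high:
        return "noise"       # unmodelled excess variation
    if infit < low or outfit < low:
        return "muting"      # too little variation
    return "acceptable"

print(classify_fit(0.8, 0.7))   # application #3060
print(classify_fit(2.1, 1.9))   # hypothetical noisy element
print(classify_fit(0.5, 0.5))   # hypothetical muted element
```

Checking noise first reflects the emphasis in the text that excess variation is the more serious problem; an element with one statistic above 1.4 and the other below 0.6 would be reported as noise under this ordering.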

Rater Analysis 
Rater Quality 

A more detailed analysis of rater behavior is found in Table 7.2.1, the Raters
Measurement Report for the Reading Excellence Act Tutorial Assistance
survey results. Raters are presented in descending order of severity; in other words,
raters #24 and #61 are the most severe and raters #144 and #39 are the least severe.

Reading Excellence Act Tutorial Assistance 04-30-2000 01:27:25
Table 7.2.1 raters Measurement Report (arranged by mN).

| Obsvd Obsvd   Obsvd    Fair |        Model |     Infit      Outfit |
| Score Count Average  Avrage |Measure  S.E. | MnSq ZStd   MnSq ZStd | Num  raters
|-----------------------------|--------------|-----------------------|------------
|   287   130     2.2    2.21 |    .75   .09 |  0.9    0    0.8   -1 |  24  24
|   346   130     2.7    2.22 |    .74   .09 |  1.1    1    1.2    1 |  61  61
|   361   130     2.8    2.33 |    .63   .08 |  0.8   -1    0.8   -1 |  51  51
|   340   156     2.2    2.37 |    .60   .08 |  1.3    2    1.3    2 |  10  10
|   402   130     3.1    2.41 |    .56   .09 |  1.3    2    1.4    3 | 115  115
|   650   234     2.8    2.50 |    .48   .06 |  1.6    5    1.7    6 |  27  27 **
|   332   130     2.6    2.52 |    .47   .09 |  0.7   -3    0.7   -2 |   1  1 *
|   337   130     2.6    2.53 |    .45   .09 |  0.5   -5    0.5   -5 |  59  59 *
|   493   182     2.7    2.55 |    .44   .07 |  0.8   -1    0.9    0 |  63  63
|   288   104     2.8    2.56 |    .43   .10 |  0.6   -3    0.7   -2 |  62  62 *
|   391   130     3.0    2.60 |    .39   .08 |  0.9   -1    0.9   -1 |  11  11
|   373   156     2.4    2.61 |    .39   .08 |  1.1    0    1.3    2 | 105  105
|   391   130     3.0    2.62 |    .38   .08 |  0.9    0    0.9   -1 |  40  40
|   324   130     2.5    2.61 |    .38   .09 |  0.5   -4    0.6   -3 | 118  118 *
|   467   156     3.0    2.63 |    .37   .08 |  0.8   -1    0.8   -2 |  17  17
|   442   156     2.8    2.64 |    .36   .08 |  0.7   -3    0.7   -3 |  26  26 *
|   547   182     3.0    2.67 |    .33   .07 |  1.2    2    1.2    2 |  87  87
|   429   156     2.8    2.69 |    .32   .08 |  2.1    7    1.9    6 |  50  50 **
|   394   130     3.0    2.69 |    .31   .09 |  1.0    0    1.0    0 |  46  46
|   448   156     2.9    2.70 |    .31   .08 |  0.6   -4    0.6   -4 |  99  99 *
|   397   156     2.5    2.70 |    .31   .08 |  0.7   -3    0.7   -2 | 175  175 *
|   549   182     3.0    2.71 |    .30   .07 |  0.6   -5    0.6   -5 |  73  73 *
|   256    78     3.3    2.71 |    .30   .11 |  1.0    0    1.0    0 | 122  122
|   345   130     2.7    2.72 |    .29   .09 |  0.8   -1    0.9   -1 |  82  82
|   372   130     2.9    2.73 |    .28   .08 |  0.8   -1    0.8   -1 |   8  8
|   420   156     2.7    2.74 |    .27   .08 |  0.9    0    0.9    0 |  85  85
|   494   156     3.2    2.78 |    .24   .08 |  0.9   -1    0.9   -1 |   6  6
|   401   156     2.6    2.78 |    .24   .08 |  0.7   -2    0.7   -2 | 176  176 *
|   328   130     2.5    2.82 |    .21   .09 |  1.0    0    1.1    0 |  98  98
|   363   130     2.8    2.84 |    .19   .08 |  1.5    3    1.5    3 |  12  12 **
|   378   130     2.9    2.85 |    .18   .08 |  0.9   -1    0.9   -1 | 126  126
|   963   312     3.1    2.85 |    .18   .05 |  1.6    6    1.6    6 | 146  146 **
|   738   286     2.6    2.87 |    .17   .06 |  1.0    0    1.0    0 |  28  28
|   353   104     3.4    2.86 |    .17   .09 |  0.8   -1    0.8   -1 |  53  53
|   534   156     3.4    2.86 |    .17   .08 |  0.5   -6    0.5   -6 |  57  57 *
|   344   130     2.6    2.86 |    .17   .09 |  0.7   -3    0.7   -3 | 102  102 *
|   964   312     3.1    2.87 |    .16   .06 |  1.1    0    1.0    0 |  16  16
|   884   286     3.1    2.87 |    .16   .06 |  1.3    3    1.4    3 |  45  45
|   327   130     2.5    2.89 |    .15   .09 |  1.2    1    1.2    1 |  19  19
|   423   130     3.3    2.89 |    .15   .09 |  0.8   -1    0.8   -1 |  65  65
|   407   130     3.1    2.90 |    .14   .09 |  1.0    0    0.9    0 |  58  58
|   363   130     2.8    2.89 |    .14   .08 |  0.9   -1    0.9   -1 | 116  116
|   390   156     2.5    2.91 |    .13   .08 |  1.4    3    1.4    3 |  71  71 *
|   499   156     3.2    2.91 |    .13   .08 |  1.1    0    1.1    0 |  96  96
|   418   156     2.7    2.91 |    .13   .08 |  0.8   -1    0.9    0 | 100  100
|   373   130     2.9    2.93 |    .12   .09 |  1.7    5    1.7    5 |  88  88 *
|   422   130     3.2    2.92 |    .12   .09 |  0.9    0    0.9    0 | 129  129
|   560   182     3.1    2.93 |    .11   .07 |  1.0    0    1.0    0 |  30  30
|   365   130     2.8    2.94 |    .10   .08 |  0.5   -5    0.5   -5 | 125  125 *
|   440   130     3.4    2.97 |    .08   .08 |  1.2    1    1.2    1 |  52  52
|   356   130     2.7    2.97 |    .08   .09 |  1.7    5    1.7    4 |  54  54 **
|   436   156     2.8    2.97 |    .08   .08 |  1.1    0    1.1    0 |  79  79
|   338   104     3.3    2.98 |    .08   .09 |  0.6   -3    0.6   -3 | 109  109 *
|   378   130     2.9    2.98 |    .07   .09 |  0.9   -1    0.9    0 |  80  80
|   418   156     2.7    2.98 |    .07   .08 |  0.9   -1    0.9   -1 | 170  170
|   458   156     2.9    2.99 |    .07   .08 |  1.3    2    1.3    2 | 179  179
|   548   182     3.0    3.02 |    .04   .07 |  0.7   -3    0.7   -2 |  86  86 *
|   526   156     3.4    3.02 |    .04   .08 |  0.7   -3    0.7   -3 | 171  171 *
|   510   156     3.3    3.05 |    .02   .08 |  1.0    0    1.0    0 |   2  2
|   482   156     3.1    3.07 |    .00   .08 |  0.8   -2    0.8   -2 |   4  4
|   322   104     3.1    3.07 |    .00   .09 |  1.8    5    1.8    5 |  21  21 **
|   392   130     3.0    3.07 |    .00   .09 |  0.7   -3    0.7   -2 |  23  23 *
|   429   130     3.3    3.06 |    .00   .08 |  0.6   -4    0.6   -4 | 180  180 *
|   376   104     3.6    3.11 |   -.04   .09 |  0.8   -2    0.8   -2 | 108  108
|   487   130     3.7    3.12 |   -.04   .08 |  0.8   -1    0.8   -2 | 168  168
|   518   156     3.3    3.16 |   -.07   .08 |  1.0    0    1.0    0 |  18  18
|   574   182     3.2    3.16 |   -.07   .07 |  1.4    3    1.3    2 | 174  174 **
|   346   104     3.3    3.16 |   -.08   .09 |  1.1    0    1.2    1 | 140  140
|   430   130     3.3    3.17 |   -.08   .08 |  1.0    0    1.0    0 | 177  177
|   478   156     3.1    3.18 |   -.09   .08 |  0.7   -3    0.7   -3 | 112  112 *
|   594   156     3.8    3.17 |   -.09   .08 |  1.2    1    1.2    1 | 130  130
|   447   130     3.4    3.18 |   -.09   .08 |  0.6   -3    0.6   -3 | 185  185 *
|   193    52     3.7    3.19 |   -.10   .13 |  0.6   -2    0.6   -2 |  38  38 *
|   477   130     3.7    3.18 |   -.10   .09 |  1.0    0    1.1    0 |  93  93
|   441   130     3.4    3.20 |   -.10   .08 |  1.0    0    1.0    0 | 132  132
|   473   130     3.6    3.21 |   -.11   .08 |  0.8   -1    0.8   -1 |  13  13
|   225    78     2.9    3.21 |   -.12   .11 |  1.0    0    1.0    0 | 110  110
|   497   130     3.8    3.22 |   -.12   .09 |  1.3    2    1.3    2 | 166  166
|   532   156     3.4    3.21 |   -.12   .08 |  0.9   -1    0.9   -1 | 181  181
|   471   156     3.0    3.23 |   -.13   .08 |  1.3    2    1.3    2 | 101  101
|   508   156     3.3    3.25 |   -.15   .08 |  1.6    5    1.6    5 |   5  5 **
|   476   130     3.7    3.25 |   -.15   .08 |  0.8   -1    0.9   -1 |  14  14
|   362   130     2.8    3.24 |   -.15   .09 |  1.2    1    1.1    1 | 182  182
|   205    52     3.9    3.28 |   -.17   .14 |  0.6   -2    0.6   -2 |  42  42 **
|   526   156     3.4    3.27 |   -.17   .08 |  0.8   -2    0.8   -1 | 123  123
|  2045   598     3.4    3.27 |   -.17   .04 |  0.5   -9    0.5   -9 | 136  136 *
|   583   156     3.7    3.30 |   -.19   .08 |  2.2    8    2.2    8 | 133  133 **
|   654   182     3.6    3.32 |   -.21   .07 |  0.8   -2    0.8   -2 |  44  44
|   526   156     3.4    3.32 |   -.21   .08 |  0.9    0    0.9    0 |  49  49
|   554   182     3.0    3.35 |   -.24   .07 |  0.9    0    0.9    0 |  15  15
|   490   156     3.1    3.36 |   -.24   .08 |  0.9   -1    0.9   -1 | 104  104
|   564   156     3.6    3.39 |   -.27   .08 |  0.7   -3    0.7   -2 | 114  114 *
|   499   156     3.2    3.41 |   -.28   .08 |  0.7   -3    0.7   -2 |  56  56 *
|   491   130     3.8    3.42 |   -.29   .09 |  0.8   -1    0.8   -1 |   9  9
|   475   130     3.7    3.42 |   -.29   .08 |  1.0    0    1.0    0 |  90  90
|   385   130     3.0    3.42 |   -.29   .08 |  1.6    4    1.6    4 | 172  172 **
|   547   156     3.5    3.44 |   -.31   .08 |  0.5   -5    0.5   -5 |  66  66 *
|   589   156     3.8    3.44 |   -.31   .08 |  0.9    0    0.9    0 |  95  95
|   546   182     3.0    3.52 |   -.38   .08 |  1.5    3    1.3    2 |  25  25 **
|   291    78     3.7    3.53 |   -.39   .11 |  1.5    2    1.5    2 |  89  89 **
|   587   156     3.8    3.53 |   -.39   .08 |  1.6    4    1.5    4 | 178  178 **
|   472   130     3.6    3.56 |   -.41   .09 |  1.5    3    1.5    3 | 167  167 **
|   499   130     3.8    3.56 |   -.42   .09 |  1.5    3    1.5    3 |  81  81 **
|   491   130     3.8    3.57 |   -.43   .09 |  1.0    0    1.0    0 |  22  22
|   293    78     3.8    3.59 |   -.44   .11 |  1.5    2    1.5    2 |   7  7 **
|   397   104     3.8    3.59 |   -.44   .10 |  0.7   -2    0.7   -2 |  92  92 *
|   547   130     4.2    3.68 |   -.52   .09 |  0.7   -3    0.6   -3 | 141  141 *
|   482   130     3.7    3.69 |   -.53   .09 |  0.7   -2    0.7   -2 | 143  143 *
|   298   104     2.9    3.77 |   -.60   .10 |  1.3    1    1.3    1 |  67  67
|   724   182     4.0    3.81 |   -.64   .07 |  1.1    0    1.1    1 |  20  20
|   186    52     3.6    3.82 |   -.65   .14 |  0.5   -3    0.5   -3 | 169  169 *
|   566   130     4.4    4.00 |   -.82   .09 |  1.1    0    1.1    0 |  75  75
|   243    52     4.7    4.17 |   -.98   .16 |  0.9    0    0.9    0 |  39  39
|   383    78     4.9    4.59 |  -1.46   .14 |  0.3   -5    0.3   -5 | 144  144 *
|-----------------------------|--------------|-----------------------|------------
| 462.7 146.6     3.2    3.07 |    .00   .08 |  1.0 -0.4    1.0 -0.3 | Mean (Count: 114)
| 198.5  59.8     0.5    0.40 |    .35   .02 |  0.4  3.1    0.3  3.1 | S.D.




RMSE (Model) .09 Adj S.D. .34 Separation 3.89 Reliability .94 

Fixed (all same) chi-square: 1615.2 d.f.: 113 significance: .00 

Random (normal) chi-square: 110.2 d.f.: 112 significance: .53 



Note : * Muted 

** Noise 



The FACETS analysis provides a number of indications of the magnitude of the
differences among elements of a facet: in this case, in severity among raters. These are
the RMSE, Reliability, Separation Index, Fixed (all same) and Random (normal)
Chi-square, and Infit and Outfit statistics.

The root mean square standard error (RMSE) is computed over all non-extreme
rater measures. The RMSE of .09 shows that rater error is very low. After
the rater variance was adjusted for measurement error, the adjusted standard deviation
was below 1.0 (.34); thus any rater score is likely to be within .09 points for each item.

The Reliability statistic provided by the FACETS analysis indicates the degree to
which the analysis reliably distinguishes between different levels of severity among the
elements of the facet (in this case, the different raters). Table 7.2.1 shows that the
reliability for raters is .94. This indicates that the analysis is fairly reliably
separating raters into approximately 4 different levels of leniency and severity.

The Separation index is the ratio of the corrected standard deviation of element
measures (in this case, raters) to the root mean-square estimation error. If the raters
were equally severe, the standard deviation of the rater severity estimates would be
equal to or smaller than the mean estimation error of the entire data set. However, the
Rater Separation Index is 3.89, indicating that the spread (standard deviation) of the
rater measures is about four times the error of the estimates.

Finally, the Fixed Chi-square tests the null hypothesis that all of the elements of the
facet are equal. The Chi-square of 1615.2 with 113 d.f. is significant at p < .01 (reported
significance .00), indicating that the null hypothesis must be rejected; in other words,
the raters are not equally severe.

The FACETS analysis also provides two measures of fit, or consistency: the infit and
the outfit scores. The infit is the weighted mean-squared residual, which is sensitive to
unexpected responses near the point where decisions are being made. A value less than 0.6
indicates muting: too little variation, lack of independence. A value more than 1.4
indicates noise: unmodelled excess variation. The outfit, on the other hand, is the
unweighted mean-squared residual and is sensitive to extreme scores. This fit statistic
has the same form as the infit, but it is the conventional mean-square, which is more
sensitive to outliers.

In addition to the mean squares, FACETS provides standardized infit and outfit
statistics, which have an expected mean of 0 and standard deviation of 1. These statistics
are useful for comparing the elements of a facet with each other, as they show the degree
of variability in each rater’s ratings relative to the amount of variability in the entire
set. Standardized fit statistics greater than 2 or 3, or less than -2 or -3, are generally
signs of misfit.

Applying this information to Table 7.2.1, we can see that 23 out of 114 raters
had either high (>1.4) or low (<0.6) infit and/or outfit statistics. The following
readers have high (>1.4) infit/outfit statistics: #27, #50, #12, #146, #71, #88, #54,
#21, #5, #133, #172, #25, #89, #178, #81, and #167. In addition, 7 readers have low
(<0.6) infit/outfit statistics: #118, #57, #125, #66, #92, #169, and #144 (see Table
7.2.1 for more detail). The low infit and outfit scores are of less
concern than the high infit and outfit scores. The low scores tend to reflect flat-lining
and a lack of discrimination. The high infit and outfit scores are probably a concern for
raters #50, #21, and #133. Those raters might profit from additional training. It also
might be prudent to remove these raters from the calibration of the applications.

Item Analysis

Item Quality 

A more detailed analysis of the items is found in Table 7.3.1, the Items
Measurement Report for the Reading Excellence Act Tutorial Assistance survey results.
Items are presented in descending order of difficulty to endorse; in other words, items
#18 and #10 are the most difficult to endorse and items #3 and #4 are the least difficult
to endorse.



Reading Excellence Act Tutorial Assistance 04-30-2000 01:27:25 
Table 7.3.1 items Measurement Report (arranged by mN) . 



| Obsvd Obsvd   Obsvd    Fair |        Model |     Infit      Outfit |
| Score Count Average  Avrage |Measure  S.E. | MnSq ZStd   MnSq ZStd |  Nu  items
|-----------------------------|--------------|-----------------------|-----------
|  1643   643     2.6    2.39 |    .58   .04 |  1.0    0    1.0    0 |  18  18
|  1684   643     2.6    2.46 |    .51   .04 |  1.1    2    1.2    2 |  10  10
|  1755   643     2.7    2.59 |    .41   .04 |  0.9   -1    1.0    0 |  13  13
|  1769   643     2.8    2.61 |    .38   .04 |  1.0    0    1.0    0 |   8  8
|  1788   643     2.8    2.64 |    .36   .04 |  0.9   -1    0.9   -2 |  16  16
|  1794   643     2.8    2.65 |    .35   .04 |  1.0    0    1.0    0 |  21  21
|  1826   643     2.8    2.71 |    .30   .04 |  1.0    0    1.0    0 |  12  12
|  1823   643     2.8    2.70 |    .30   .04 |  0.8   -4    0.8   -4 |  17  17 *
|  1904   643     3.0    2.85 |    .18   .04 |  1.7    9    1.8    9 |  14  14 **
|  1926   643     3.0    2.88 |    .15   .04 |  0.4   -9    0.4   -9 |  26  26 **
|  1943   643     3.0    2.91 |    .13   .04 |  1.0    0    1.0    0 |  23  23
|  1961   643     3.0    2.95 |    .10   .04 |  1.0    0    1.0    0 |   9  9
|  2009   643     3.1    3.03 |    .03   .04 |  0.9   -1    1.0    0 |  11  11
|  2018   643     3.1    3.05 |    .02   .04 |  1.3    6    1.3    5 |  22  22 **
|  2034   643     3.2    3.07 |   -.01   .04 |  0.9   -1    0.9   -1 |  20  20
|  2055   643     3.2    3.11 |   -.04   .04 |  0.9    0    1.0    0 |   2  2
|  2090   643     3.3    3.17 |   -.09   .04 |  0.9   -1    0.9   -1 |  15  15
|  2102   643     3.3    3.19 |   -.10   .04 |  1.0    0    0.9   -1 |  19  19
|  2108   643     3.3    3.21 |   -.11   .04 |  1.0    0    1.0    0 |  24  24
|  2195   643     3.4    3.36 |   -.24   .04 |  1.1    1    1.0    0 |  25  25
|  2201   643     3.4    3.37 |   -.25   .04 |  1.0    0    1.0    0 |   7  7
|  2243   643     3.5    3.44 |   -.31   .04 |  0.9   -2    0.9   -2 |   6  6
|  2332   643     3.6    3.60 |   -.45   .04 |  1.0    0    1.0    0 |   5  5
|  2365   643     3.7    3.65 |   -.50   .04 |  1.3    5    1.3    4 |   1  1 **
|  2464   643     3.8    3.82 |   -.65   .04 |  0.8   -3    0.8   -3 |   4  4
|  2716   643     4.2    4.24 |  -1.06   .04 |  0.9   -1    0.9   -1 |   3  3
|-----------------------------|--------------|-----------------------|-----------
|2028.8 643.0     3.2    3.06 |    .00   .04 |  1.0 -0.3    1.0 -0.2 | Mean (Count: 26)
| 250.5   0.0     0.4    0.44 |    .38   .00 |  0.2  3.4    0.2  3.3 | S.D.




RMSE (Model) .04 Adj S.D. .37 Separation 9.66 Reliability .99 
Fixed (all same) chi-square: 2324.2 d.f.: 25 significance: .00 

Random (normal) chi-square: 25.0 d.f.: 24 significance: .41 



* : Muting 

** : Noise 
As noted earlier, the FACETS analysis provides a number of indications of the 
magnitude of the differences among the elements of a facet: in this case, the quality of 
the items. These are the RMSE, Reliability, Separation Index, Fixed (all same) and 
Random (normal) Chi-square, and the Infit and Outfit statistics. 

The Root Mean Square Standard Error (RMSE) is computed over all non-extreme 
measures of the items. The RMSE of .04 shows that the measurement error in the item 
quality estimates is very low. After the item variance has been adjusted for 
measurement error, the adjusted standard deviation is .37, well below 1.0. 

The Reliability statistic is the Rasch equivalent of the KR-20 or Cronbach's Alpha 
statistic, that is, the ratio of “true variance” to “observed variance”. The reliability 
reported by FACETS shows how different the measures are, which may or may not 
indicate how “good” the test is. High (near 1.0) item reliabilities are preferred. In this 
case, the reliability is .99, indicating that the analysis is very reliably separating items 
into different levels of difficulty. 

The Separation Index is the ratio of the corrected standard deviation of the element 
measures (in this case, items) to the root mean-square estimation error. If the items were 
of equal difficulty, the standard deviation of the item quality estimates would be equal to 
or smaller than the mean estimation error of the entire data set. However, the Item 
Separation Index is 9.66, indicating that the spread of the item measures is about ten 
times the error of the estimates. 
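
The relationship between the adjusted standard deviation, the RMSE, the Separation 
Index, and the Reliability can be sketched as follows. This is a minimal illustration that 
assumes the usual Rasch definitions (separation = Adj S.D. / RMSE; reliability = true 
variance / observed variance); the function name is hypothetical. Because it uses the 
rounded values printed in Table 7.3.1, the separation comes out near 9.25 rather than the 
reported 9.66, which FACETS computes from unrounded values. 

```python
def separation_and_reliability(adj_sd, rmse):
    """Return (separation, reliability) for one facet.

    separation  = adjusted ("true") standard deviation / RMSE
    reliability = true variance / observed variance
                = adj_sd**2 / (adj_sd**2 + rmse**2)
    """
    separation = adj_sd / rmse
    reliability = adj_sd ** 2 / (adj_sd ** 2 + rmse ** 2)
    return separation, reliability


# Rounded item-facet values from Table 7.3.1 (Adj S.D. = .37, RMSE = .04).
sep, rel = separation_and_reliability(0.37, 0.04)
print(f"separation ~ {sep:.2f}, reliability ~ {rel:.2f}")
```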

Finally, the Fixed Chi-square tests the null hypothesis that all of the elements of the 
facet are equal. The Chi-square of 2324.3 with 25 d.f. is significant (p < .01), indicating 
that the null hypothesis must be rejected; in other words, the items are not of equal 
difficulty. 
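
As a rough check on this statistic, the sketch below computes a homogeneity chi-square 
from the item measures and standard errors listed in Table 7.3.1. It assumes the 
information-weighted form, chi-square = sum(w d^2) - (sum(w d))^2 / sum(w) with 
w = 1 / S.E.^2, which is our reading of how the "Fixed (all same)" statistic is defined; 
the function name is hypothetical. Since the table prints rounded measures and a rounded 
S.E. of .04, the result only approximates the reported value. 

```python
def fixed_chi_square(measures, standard_errors):
    """Homogeneity ("fixed, all same") chi-square for a set of measures.

    Each measure d is weighted by its information w = 1 / SE**2:
        chi2 = sum(w * d**2) - (sum(w * d))**2 / sum(w),  d.f. = n - 1
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    sum_wd = sum(w * d for w, d in zip(weights, measures))
    sum_wd2 = sum(w * d * d for w, d in zip(weights, measures))
    return sum_wd2 - sum_wd ** 2 / sum_w, len(measures) - 1


# Item measures (logits) as printed in Table 7.3.1, most to least difficult.
item_measures = [
    0.58, 0.51, 0.41, 0.38, 0.36, 0.35, 0.30, 0.30, 0.18, 0.15, 0.13, 0.10,
    0.03, 0.02, -0.01, -0.04, -0.09, -0.10, -0.11, -0.24, -0.25, -0.31,
    -0.45, -0.50, -0.65, -1.06,
]
chi2, df = fixed_chi_square(item_measures, [0.04] * len(item_measures))
print(f"chi-square ~ {chi2:.0f} on {df} d.f.")
```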

The FACETS analysis also provides two measures of fit, or consistency: the infit 
and the outfit. The infit is the weighted mean-squared residual, which is sensitive to 
unexpected responses near the point where decisions are being made. A value less than 
0.6 indicates muting: too little variation, or a lack of independence. A value of more than 
1.4 indicates noise: unmodelled excess variation. The outfit, on the other hand, is the 
unweighted mean-squared residual. This fit statistic has the same form as the infit, but it 
is the conventional mean-square, which is more sensitive to outliers and extreme scores. 
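
The difference between the two mean-squares can be sketched as follows. This is a 
minimal illustration that assumes the standard Rasch definitions: the outfit is the plain 
average of the squared standardized residuals, while the infit weights each squared 
residual by its model variance. The function name and the five ratings are hypothetical 
and are not taken from the study data. 

```python
def fit_mean_squares(observed, expected, variances):
    """Return (infit, outfit) mean-squares for one element.

    outfit = mean of squared standardized residuals (unweighted)
    infit  = sum of squared residuals / sum of model variances
             (i.e. information-weighted)
    """
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(sq_resid)
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit


# Hypothetical ratings: the last one is a surprising rating on an
# observation with little model variance, i.e. an outlier.
observed = [3, 4, 2, 5, 1]
expected = [3.2, 3.6, 2.5, 3.0, 2.9]
variances = [0.8, 0.7, 0.9, 0.8, 0.2]
infit, outfit = fit_mean_squares(observed, expected, variances)
print(f"infit = {infit:.2f}, outfit = {outfit:.2f}")
```

In this example the outlier inflates the outfit more than the infit, which is the 
behaviour described above. 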

In addition to the mean squares, FACETS provides standardized infit and outfit 
statistics, which have an expected mean of 0 and a standard deviation of 1. These 
statistics are useful for comparing the elements of a facet with each other, as they show 
the degree of variability in each item's ratings relative to the amount of variability in the 
entire set. Standardized fit statistics greater than 2 or 3, or less than -2 or -3, are 
generally signs of misfit. 

Applying this information to Table 7.3.1, we can see that 1 out of 26 items had 
either a high (> 1.4) or low (< 0.6) infit/outfit statistic: item #14. These statistics indicate 
that this item was not consistent with the estimated ability measures of the applications, 
and that the scores for this item may not be stable. 
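
The decision rule used here can be written down in a few lines. The sketch below 
applies the 0.6 and 1.4 mean-square cutoffs stated above to two items from Table 7.3.1; 
the function name is hypothetical, and the labels simply follow the "muting" and "noise" 
terminology of the table legend. 

```python
def classify_fit(mean_square, low=0.6, high=1.4):
    """Label a mean-square fit statistic using the cutoffs in the text."""
    if mean_square < low:
        return "muting"      # too little variation
    if mean_square > high:
        return "noise"       # unmodelled excess variation
    return "acceptable"


# (infit MnSq, outfit MnSq) for two items as printed in Table 7.3.1.
items = {14: (1.7, 1.8), 18: (1.0, 1.0)}
for number, (infit, outfit) in items.items():
    print(number, classify_fit(infit), classify_fit(outfit))
```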

Application/Rater Interaction 

A z-score above 2.0 or below -2.0 would indicate an interaction effect. According 
to the Bias/Interaction report in the FACETS analysis (Table 13.1.1), there were several 
raters who seemed to be too lenient or too severe on certain applications. Z-scores in this 
bias analysis ranged from -8.14 to 6.00. Many of these interaction effects came from the 
calibration application (#3), which was used in the process of training the raters. 



Table 13.1.1 Bias/Interaction Calibration Report (arranged by mN). 

Bias/Interaction analysis specified by Model: ?B, ?B, ?, RATINGS 

--------------------------------------------------------------------------------------------------------------
| Obsvd   Exp.  Obsvd  Obs-Exp |   Bias+  Model          | Infit  Outfit |     |                |                 |
| Score  Score  Count  Average | Measure   S.E.  Z-Score |  MnSq    MnSq |  Sq | Num app  measr | Num rate  measr |
--------------------------------------------------------------------------------------------------------------
|    31   46.2     26     -.59 |    1.43    .45     3.22 |   1.1     1.1 | 294 |    3101  -1.68 |       71    .13 |
|    58   92.7     26    -1.33 |    1.20    .20     6.00 |   1.0     0.9 | 222 |       3    .00 |       52    .08 |
|    49   76.0     26    -1.04 |    1.04    .22     4.65 |   0.5     0.6 | 217 |       3    .00 |       51    .63 |
|    34   47.2     26     -.51 |    1.02    .35     2.88 |   0.7     0.6 |  48 |    3098  -1.16 |       10    .60 |
|    59   88.3     26    -1.13 |    1.01    .20     5.11 |   0.6     0.6 | 357 |    3016   -.52 |       90   -.29 |
|    34   45.7     26     -.45 |     .94    .35     2.65 |   0.6     0.7 |  86 |    3101  -1.68 |       16    .16 |
|    59   84.5     26     -.98 |     .88    .20     4.46 |   0.9     0.9 | 566 |    3064   -.40 |      168   -.04 |
|    36   48.8     26     -.49 |     .87    .32     2.74 |   0.7     0.8 |  80 |    3037  -1.52 |       16    .16 |
|    34   44.5     26     -.40 |     .86    .35     2.44 |   1.4     1.1 | 137 |    3087  -2.29 |       25   -.38 |
|    56   79.5     26     -.90 |     .84    .20     4.12 |   1.7     1.8 | 146 |    3025   -.04 |       27    .48 |
|    47   66.7     26     -.76 |     .82    .23     3.56 |   1.7     1.8 | 163 |    3106   -.79 |       28    .17 |
|    51   72.2     26     -.82 |     .81    .22     3.77 |   0.4     0.4 | 156 |    3067   -.60 |       28    .17 |
|    78  101.6     26     -.91 |     .81    .18     4.43 |   0.2     0.2 | 511 |    3028    .07 |      136   -.17 |
|    61   84.4     26     -.90 |     .80    .19     4.13 |   0.5     0.5 | 158 |    3073   -.19 |       28    .17 |
|    51   71.9     26     -.80 |     .80    .22     3.72 |   1.2     1.1 | 547 |    3067   -.60 |      146    .18 |
|    71   94.1     26     -.89 |     .77    .18     4.19 |   0.2     0.2 | 179 |    3107    .35 |       40    .38 |
|    64   86.8     26     -.88 |     .77    .19     4.05 |   0.3     0.3 | 461 |    3018    .03 |      122    .30 |
|    76   98.2     26     -.85 |     .75    .18     4.10 |   0.4     0.4 |  59 |       3    .00 |       13   -.11 |
|    52   71.8     26     -.76 |     .75    .21     3.53 |   1.3     1.3 | 300 |    3074   -.48 |       73    .30 |
|    57   78.2     26     -.82 |     .75    .20     3.75 |   0.8     0.8 | 445 |       3    .00 |      115    .56 |
|    62   83.3     26     -.82 |     .73    .19     3.77 |   0.7     0.7 |  49 |       3    .00 |       11    .39 |
|    55   75.1     26     -.77 |     .73    .21     3.57 |   1.3     1.3 | 168 |    3052   -.55 |       30    .11 |
|    69   90.7     26     -.84 |     .73    .19     3.90 |   0.3     0.3 | 450 |       3    .00 |      116    .14 |
|    74   95.5     26     -.83 |     .72    .18     3.93 |   0.5     0.5 | 117 |    3001   -.41 |       22   -.43 |
|    60   80.9     26     -.80 |     .72    .20     3.70 |   0.6     0.5 | 151 |    3095    .01 |       27    .48 |
|    53   72.1     26     -.73 |     .72    .21     3.41 |   1.3     1.3 | 631 |    3071   -.88 |      181   -.12 |
|    43   57.4     26     -.55 |     .71    .25     2.81 |   1.5     1.2 | 212 |    3023   -.98 |       50    .32 |
|    86  106.0     26     -.77 |     .71    .18     3.87 |   0.4     0.5 | 526 |    3047    .33 |      140   -.08 |
|    93  111.1     26     -.70 |     .68    .19     3.65 |   1.1     1.1 | 321 |    3070    .19 |       81   -.42 |
|    70   90.2     26     -.78 |     .67    .19     3.64 |   1.1     1.1 | 189 |       3    .00 |       45    .16 |
|    89  107.7     26     -.72 |     .67    .18     3.67 |   0.6     0.7 | 368 |    3078    .37 |       93   -.10 |
|    52   69.3     26     -.66 |     .67    .21     3.13 |   0.8     0.8 | 431 |    3023   -.98 |      110   -.12 |
|    51   67.9     26     -.65 |     .66    .22     3.08 |   0.4     0.4 | 157 |    3068   -.74 |       28    .17 |
|    99  115.6     26     -.64 |     .66    .19     3.47 |   1.2     1.2 | 481 |    3029    .92 |      129    .12 |
|    91  109.2     26     -.70 |     .66    .18     3.61 |   0.4     0.4 | 533 |       3    .00 |      143   -.53 |
|    86  104.5     26     -.71 |     .65    .18     3.57 |   1.0     1.0 | 196 |    3062    .51 |       45    .16 |
|    78   97.2     26     -.74 |     .64    .18     3.54 |   0.5     0.5 | 524 |       3    .00 |      140   -.08 |
|    44   57.0     26     -.50 |     .63    .25     2.56 |   1.2     1.3 | 234 |    3055  -1.24 |       54    .08 |
|    77   95.9     26     -.73 |     .63    .18     3.49 |   0.2     0.2 | 503 |    3006   -.13 |      136   -.17 |
|    74   92.9     26     -.73 |     .63    .18     3.44 |   0.5     0.5 | 570 |       3    .00 |      170    .07 |
|    75   93.9     26     -.73 |     .63    .18     3.45 |   0.4     0.4 | 576 |       3    .00 |      171    .04 |
|    73   91.6     26     -.72 |     .62    .18     3.38 |   0.8     0.8 | 618 |    3013   -.05 |      179    .07 |
|    93  109.5     26     -.63 |     .61    .19     3.28 |   0.3     0.3 | 505 |    3014    .37 |      136   -.17 |
|    86  103.1     26     -.66 |     .60    .18     3.27 |   0.8     0.8 |  38 |       3    .00 |        9    .29 |
|    54   70.0     26     -.61 |     .60    .21     2.90 |   0.4     0.4 | 220 |    3057   -.20 |       51    .63 |
|    88  104.9     26     -.65 |     .60    .18     3.27 |   0.4     0.4 | 447 |    3029    .92 |      115    .56 |
|    72   89.4     26     -.67 |     .58    .18     3.14 |   1.1     1.1 |  54 |       3    .00 |       12    .19 |
|    80   97.2     26     -.66 |     .58    .18     3.19 |   0.2     0.2 | 508 |    3020   -.09 |      136   -.17 |
|    45   57.2     26     -.47 |     .58    .24     2.41 |   1.1     1.1 | 552 |    3104  -1.13 |      146    .18 |
|    62   78.5     26     -.63 |     .57    .19     2.94 |   2.2     2.1 | 114 |    3052   -.55 |       21    .00 |
|    47   59.5     26     -.48 |     .56    .23     2.41 |   0.8     0.8 | 235 |    3105  -1.14 |       54    .08 |
|   100  113.9     26     -.53 |     .55    .19     2.87 |   0.2     0.2 | 510 |    3026    .56 |      136   -.17 |
|    74   90.2     26     -.62 |     .54    .18     2.95 |               |     |                |                 |
--------------------------------------------------------------------------------------------------------------
Fixed (all = 0) chi-square: 2484.3  d.f.: 643  significance: .00 



For example, rater #16, with an expected score of 73.4, had an observed score of 122 
on application #3052, translating into a z-score of -8.14. Rater #52, with an expected 
score of 92.7, had an observed score of 58 on application #0003, translating into a 
z-score of 6.00. 

There was an overall statistically significant rater by application interaction effect 
(χ² = 2484.3, d.f. = 643, p < .01). 
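
The z-scores in Table 13.1.1 appear to be the bias measure divided by its model 
standard error, which can be checked with a short sketch. The function names are 
hypothetical, and because the table prints rounded values, the ratio will not always 
reproduce the printed z-score exactly. The example uses the row for rater #52 on 
application #3 (bias measure 1.20, S.E. .20). 

```python
def bias_z_score(bias_measure, standard_error):
    """Standardize a bias/interaction estimate by its standard error."""
    return bias_measure / standard_error


def shows_interaction(z, cutoff=2.0):
    """Apply the |z| > 2.0 rule of thumb used in the text."""
    return abs(z) > cutoff


# Rater #52 on application #3, values as printed in Table 13.1.1.
z = bias_z_score(1.20, 0.20)
print(f"z ~ {z:.2f}, interaction flagged: {shows_interaction(z)}")
```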

Summary 

The FACETS analysis provided an assessment of quality in the applications, 
raters, and items. The specific indications of quality are Separation Index, Reliability, 
RMSE, and Fixed and Random Chi-Square. The FACETS analysis also provided two 
measures of fit, or consistency on each of the three facets: the infit and outfit. The infit is 
the weighted mean-squared residual that is sensitive to unexpected responses within 
expected response parameters. On the other hand, the outfit is the unweighted mean- 
squared residual and is sensitive to extreme scores. 

In this analysis, the results showed that just 21% of the applications have 
acceptable quality. 11 out of 106 applications showed very high infit statistics (3 > Infit > 
-3) and 5 out of 106 applications showed very high outfit statistics (.06 > outfit > -3). 
These statistics indicated that most applications were consistent with the estimated 
quality measures, or that their scores were highly predictable. Of the 11 applications with 
high infit/outfit, only 3 showed cause for concern. 

For raters, the results showed that 23 out of 114 raters were found to have high infit 
and outfit statistics. Sixteen of the raters had high infit/outfit statistics, but only 3 raters 
were problematic. None of the raters with low infit/outfit statistics were problematic. 
These statistics indicated that these raters' ordering of the applications was generally 
consistent with the estimated quality measures of the applications. 

Finally, only 1 out of 26 items showed high infit and outfit statistics. This result 
indicated that this item may contribute minor noise to the overall calibration of the 
applications. In general, the items function very well. 

The FACETS analysis also provides the Root Mean Square Standard Error (RMSE) 
for all non-extreme measures over applications, raters, and items (.09, .09, and .04, 
respectively). These RMSE values show that the application, rater, and item measurement 
errors are very low. After the application, rater, and item variances have been adjusted 
for measurement error, all three adjusted standard deviations are below 1.0 (Adj. SD = 
.67, .33, and .37 for applications, raters, and items, respectively). The ratios of Adj. SD to 
RMSE, that is, the separation indices (7.70 for applications, 3.89 for raters, and 9.66 for 
items), are relatively high due to the low RMSEs, indicating high calibration and low 
error. 

The reliability statistic provided by the FACETS analysis indicates the degree to 
which the analysis reliably distinguishes levels of quality among the elements of the 
applications, raters, and items. For applications, raters, and items, the FACETS analysis 
produced reliability scores of .98, .94, and .99, respectively. These reliability scores 
indicate that the analysis is reliably separating applications, raters, and items into 
different levels of quality. 
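
The reported reliabilities are consistent with the reported separation ratios. A minimal 
sketch, assuming the standard Rasch relation reliability = G^2 / (1 + G^2), where G is 
the separation (Adj. SD / RMSE), recovers the three reliability values from the three 
separation values given above. The function and variable names are hypothetical. 

```python
def reliability_from_separation(g):
    """Convert a separation ratio G into a reliability coefficient."""
    return g * g / (1.0 + g * g)


# (separation, reported reliability) for each facet, as given in the text.
reported = {
    "applications": (7.70, 0.98),
    "raters": (3.89, 0.94),
    "items": (9.66, 0.99),
}
for facet, (separation, reliability) in reported.items():
    computed = round(reliability_from_separation(separation), 2)
    print(f"{facet}: computed {computed:.2f}, reported {reliability:.2f}")
```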

Conclusion 

This study demonstrated the application of a sophisticated assessment procedure 
in addressing a significant educational problem, i.e., finding a fair and consistent way to 
assess applications to a Reading Excellence program. This procedure has wide 
applicability, but is currently not well known. 